I have been dealing with an interesting forking issue at work. It
happens to involve Perl, but don't let that put you off.
So, suppose you need to perform an I/O-bound task that is eminently
parallelizable (in our case, generating and sending lots of emails).
You have learnt from previous such attempts, and broken out
Parallel::Iterator from CPAN to give you easy fork()ing goodness.
Forking can be very memory-efficient, at least under the Linux
kernel, because pages are shared between the parent and the children
via a copy-on-write system.
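Parallel::Iterator's interface is pleasantly small. A minimal sketch
of the basic pattern - each call to the worker runs in a forked
child, and the results are gathered back in the parent:

use strict;
use warnings;
use Parallel::Iterator qw( iterate_as_array );

# The worker receives ( $index, $value ) and its return value is
# collected into @doubled in the parent.
my @doubled = iterate_as_array(
    sub {
        my ( $index, $value ) = @_;
        return $value * 2;
    },
    [ 1 .. 10 ],
);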
Further suppose that you want to generate a large data structure in
the parent and share it with the children, so that you can iterate
over it. Copy-on-write pages should be cheap, right?
my $large_array_ref = get_data();
my $iter = iterate( sub {
    my $i       = $_[1];
    my $element = $large_array_ref->[$i];
    ...
}, [0..1000000] );
Sadly, when you run your program, it gobbles up memory until the
OOM killer steps in.
Our first problem was that the system malloc implementation
performed worse on this particular workload than Perl's built-in
malloc. Not a problem: we were using perlbrew anyway, so a few quick
experimental rebuilds later this was solved.
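For the record, switching to Perl's own malloc is a one-line rebuild,
since perlbrew passes -D options straight through to Configure. The
version number here is only an example:

perlbrew install perl-5.18.2 -Dusemymalloc=y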
More interesting was the slow, 60 MB/s leak that we saw after that.
There were no circular references, and everything was going out of
scope at the end of the function, so what was happening?
Recall that Perl uses reference counting to track memory
allocation. In the children, taking a reference to an element of the
large shared data structure increments that element's reference
count - and the count is stored in the element's own header, so even
this apparently read-only access writes to the relevant page in
memory, and the kernel duly copies it. Over time, as we iterated
through the entire structure, the children would end up copying
almost every page! This would double our memory costs.
(We confirmed the diagnosis using 'smem', incidentally. Very
useful.)
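You can watch the mechanism in action with the core Devel::Peek
module. A small sketch, not our production code:

use strict;
use warnings;
use Devel::Peek;

my $records = [ { id => 1 } ];

Dump( $records->[0] );           # the inner hash reports REFCNT = 1
my $element = $records->[0];     # copy the reference, as our worker did
Dump( $records->[0] );           # now REFCNT = 2 - the page holding
                                 # the hash header has been written to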
The copy-on-write semantics of fork() do not play well with
reference-counted interpreters such as Perl or CPython. Apparently a
similar issue occurs with some mark-and-sweep garbage-collection
implementations - but Ruby 2.0 is reputed to be COW-friendly.
All was not lost, however - we just needed to avoid taking any
references! The trick is to deep-copy each element without saving
any intermediate references along the way. This can be a bit
long-winded, but it works.
my $large_array_ref = get_data();
my $iter = iterate( sub {
    my $i = $_[1];
    my %clone;
    $clone{id}  = $large_array_ref->[$i]{id};
    $clone{foo} = $large_array_ref->[$i]{foo};
    ...
}, [0..1000000] );
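The crucial difference from the first version is what gets copied.
Copying a reference bumps the referent's count; copying a plain
scalar leaves the source untouched:

my $element = $large_array_ref->[$i];      # copies a reference: the
                                           # hash's REFCNT goes up
my $id      = $large_array_ref->[$i]{id};  # copies a plain scalar:
                                           # no reference count changes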
This could be improved if we wrote an XS CPAN module that cloned
data structures without incrementing any reference counts - I presume
this is possible. We tried the most common deep-copy modules from
CPAN, but have not yet found one that avoids reference counting.
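In the meantime, one way to audition a candidate deep-copy routine,
at least on Linux, is to measure how much private (unshared) memory
a forked child accumulates while cloning. A rough sketch, where
clone_candidate() stands in for whatever routine you are testing:

use strict;
use warnings;

# Sum the Private_Dirty fields in /proc/self/smaps: pages this
# process has written to, and which are therefore no longer shared.
sub private_dirty_kb {
    open my $fh, '<', '/proc/self/smaps' or die "smaps: $!";
    my $kb = 0;
    while ( my $line = <$fh> ) {
        $kb += $1 if $line =~ /^Private_Dirty:\s+(\d+) kB/;
    }
    return $kb;
}

my $data = get_data();    # built before the fork, so initially shared

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {    # in the child
    my $before = private_dirty_kb();
    my $copy   = clone_candidate($data);    # hypothetical routine under test
    printf "cloning unshared %d kB\n", private_dirty_kb() - $before;
    exit 0;
}
waitpid( $pid, 0 );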
This same problem almost certainly shows up when using the Apache
prefork MPM and mod_perl - even read-only global variables can
become unshared.
I would be very interested to learn of any other approaches people
have found to solve this sort of problem - do email me.